In machine learning, sample complexity is the number of examples needed for the estimate of a target function to be within a given error rate. The sample complexity of a machine learning algorithm characterizes its rate of consistency.

==Mathematical Setup==

Given a sample <math>S_n = \{(x_1, y_1), \ldots, (x_n, y_n)\}</math> drawn i.i.d. from a distribution <math>\rho</math> on some input space <math>Z = X \times Y</math>, a supervised learning algorithm chooses a function <math>f_n</math> from some hypothesis class <math>\mathcal{F}</math>. A desirable property of the algorithm is that it chooses functions with small expected prediction error with respect to <math>\rho</math> and some loss function <math>V \colon Y \times Y \to \mathbb{R}_{\geq 0}</math>. Specifically, it is desirable to have a consistent algorithm, that is, an algorithm that generates functions whose expected loss, or risk,

:<math>\mathcal{E}(f) = \mathbb{E}_{(x, y) \sim \rho}\left[V(f(x), y)\right]</math>

converges to the best possible risk

:<math>\mathcal{E}^*_{\mathcal{F}} = \inf_{f \in \mathcal{F}} \mathcal{E}(f).</math>

Formally, let <math>f_n</math> be the functions generated by an algorithm as the number of data points <math>n</math> grows. The algorithm is consistent if, for all <math>\epsilon > 0</math>,

:<math>\lim_{n \to \infty} \Pr_{\rho^n}\!\left[\mathcal{E}(f_n) - \mathcal{E}^*_{\mathcal{F}} > \epsilon\right] = 0,</math>

where <math>\rho^n</math> denotes the i.i.d. sampling measure <math>\rho \times \cdots \times \rho</math> (<math>n</math> times).

The consistency property is desirable, but it says nothing about how fast the risks converge. Since in practice one always deals with finite data, it is important to know how large a sample is needed to achieve a risk that is close, in the <math>\epsilon</math> sense, to the best possible for the function class. The notion of sample complexity answers this question: the sample complexity of a learning algorithm is a function <math>n(\rho, \epsilon, \delta)</math> such that for all <math>n \geq n(\rho, \epsilon, \delta)</math>,

:<math>\Pr_{\rho^n}\!\left[\mathcal{E}(f_n) - \mathcal{E}^*_{\mathcal{F}} > \epsilon\right] < \delta.</math>

In words, the sample complexity defines the rate of consistency of the algorithm: given a desired accuracy <math>\epsilon</math> and confidence <math>\delta</math>, one needs to sample at least <math>n(\rho, \epsilon, \delta)</math> data points to guarantee that the risk of the output function is within <math>\epsilon</math> of the best possible, with probability at least <math>1 - \delta</math>.
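As a worked illustration (a standard derivation sketched here for concreteness, not taken from this article), suppose <math>\mathcal{F}</math> is finite, the loss <math>V</math> is bounded in <math>[0, 1]</math>, and <math>f_n</math> is chosen by empirical risk minimization. By Hoeffding's inequality, for any fixed <math>f \in \mathcal{F}</math>,

:<math>\Pr_{\rho^n}\!\left[\left|\hat{\mathcal{E}}_n(f) - \mathcal{E}(f)\right| > \epsilon/2\right] \leq 2\exp\!\left(-n\epsilon^2/2\right),</math>

where <math>\hat{\mathcal{E}}_n(f)</math> denotes the empirical risk on the sample. A union bound over <math>\mathcal{F}</math> shows that, with probability at least <math>1 - 2|\mathcal{F}|\exp(-n\epsilon^2/2)</math>, every empirical risk is within <math>\epsilon/2</math> of its expectation, in which case the empirical risk minimizer satisfies <math>\mathcal{E}(f_n) - \mathcal{E}^*_{\mathcal{F}} \leq \epsilon</math>. Setting the failure probability to <math>\delta</math> and solving for <math>n</math> yields the sample complexity bound

:<math>n(\epsilon, \delta) = \frac{2}{\epsilon^2} \ln\frac{2|\mathcal{F}|}{\delta},</math>

which, notably, holds for every distribution <math>\rho</math>.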
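The definition can also be explored numerically. The following is a minimal sketch (the toy distribution, the threshold grid, and all names are illustrative assumptions, not part of the article): it learns a noisy threshold classifier on <math>[0, 1]</math> by empirical risk minimization over a finite grid of thresholds, and uses Monte Carlo trials to estimate <math>\Pr[\mathcal{E}(f_n) - \mathcal{E}^*_{\mathcal{F}} > \epsilon]</math> as <math>n</math> grows.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: x ~ Uniform[0, 1], clean label 1{x >= 0.5}
# flipped with probability NOISE; the hypothesis class F is the finite
# grid of threshold classifiers f_t(x) = 1{x >= t}.
TRUE_THRESHOLD = 0.5
NOISE = 0.1
THRESHOLDS = np.linspace(0.0, 1.0, 101)

def sample(n):
    """Draw n i.i.d. pairs (x, y) from the distribution rho."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = (x >= TRUE_THRESHOLD).astype(int)
    flip = rng.uniform(size=n) < NOISE
    return x, np.where(flip, 1 - y, y)

def true_risk(t):
    """Exact expected 0-1 loss of f_t under rho (closed form for this toy)."""
    # f_t disagrees with the clean label on an interval of width |t - 0.5|,
    # where its loss is 1 - NOISE; elsewhere its loss is NOISE.
    width = abs(t - TRUE_THRESHOLD)
    return width * (1 - NOISE) + (1 - width) * NOISE

def erm(x, y):
    """Empirical risk minimization over the threshold grid."""
    errors = [(np.mean((x >= t).astype(int) != y), t) for t in THRESHOLDS]
    return min(errors)[1]

best_risk = min(true_risk(t) for t in THRESHOLDS)  # E*_F, equal to NOISE here

# Estimate Pr[E(f_n) - E*_F > eps] by Monte Carlo for growing n.
eps, trials = 0.05, 200
for n in [10, 50, 100, 500, 1000]:
    gaps = [true_risk(erm(*sample(n))) - best_risk for _ in range(trials)]
    prob = np.mean(np.array(gaps) > eps)
    print(f"n={n:5d}  Pr[excess risk > {eps}] ~ {prob:.2f}")
</syntaxhighlight>

As <math>n</math> grows, the estimated probability decays toward zero, illustrating consistency; the smallest <math>n</math> at which it falls below a chosen <math>\delta</math> gives an empirical read of <math>n(\rho, \epsilon, \delta)</math> for this distribution.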